Reduced Space and Faster Convergence in Imperfect-Information Games via Regret-Based Pruning
Abstract
Counterfactual Regret Minimization (CFR) is the most popular iterative algorithm for solving zero-sum imperfect-information games. Regret-Based Pruning (RBP) is an improvement that allows poorly-performing actions to be temporarily pruned, thus speeding up CFR. We introduce Total RBP, a new form of RBP that reduces the space requirements of CFR as actions are pruned. We prove that in zero-sum games it asymptotically prunes any action that is not part of a best response to some Nash equilibrium. This leads to provably faster convergence and lower space requirements. Experiments show that Total RBP results in an order of magnitude reduction in space, and the reduction factor increases with game size.
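As a rough illustration of why poorly performing actions can be skipped, the sketch below shows regret matching, the per-information-set update rule that CFR is commonly built on, together with a simple check for actions that currently receive zero probability because their cumulative regret is negative. The function names `regret_matching` and `prunable_actions`, and the poker-style action labels, are hypothetical and chosen for this example only. It does not reproduce the paper's Total RBP procedure, which also decides how long an action stays pruned and releases the memory associated with it.

```python
from typing import Dict, List


def regret_matching(cumulative_regret: Dict[str, float]) -> Dict[str, float]:
    """Return a strategy proportional to the positive part of each regret."""
    positive = {a: max(r, 0.0) for a, r in cumulative_regret.items()}
    total = sum(positive.values())
    if total > 0.0:
        return {a: p / total for a, p in positive.items()}
    # No action has positive regret: fall back to the uniform strategy.
    n = len(cumulative_regret)
    return {a: 1.0 / n for a in cumulative_regret}


def prunable_actions(cumulative_regret: Dict[str, float]) -> List[str]:
    """List actions with negative regret that the current strategy never plays.

    Assumption for this sketch: such actions are the candidates for temporary
    pruning. How many iterations they may be skipped is not modeled here.
    """
    strategy = regret_matching(cumulative_regret)
    return [
        a
        for a, r in cumulative_regret.items()
        if r < 0.0 and strategy[a] == 0.0
    ]


if __name__ == "__main__":
    regrets = {"fold": -3.0, "call": 1.0, "raise": 3.0}
    print(regret_matching(regrets))
    print(prunable_actions(regrets))
```

In this toy example only `fold` has negative regret, so it is the only pruning candidate. When every action has non-positive regret the strategy falls back to uniform, and the check returns no candidates because every action is still played.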
Similar resources
Reduced Space and Faster Convergence in Imperfect-Information Games via Pruning
Iterative algorithms such as Counterfactual Regret Minimization (CFR) are the most popular way to solve large zero-sum imperfect-information games. In this paper we introduce Best-Response Pruning (BRP), an improvement to iterative algorithms such as CFR that allows poorly-performing actions to be temporarily pruned. We prove that when using CFR in zero-sum games, adding BRP will asymptotically...
Dynamic Thresholding and Pruning for Regret Minimization
Regret minimization is widely used in determining strategies for imperfect-information games and in online learning. In large games, computing the regrets associated with a single iteration can be slow. For this reason, pruning – in which parts of the decision tree are not traversed in every iteration – has emerged as an essential method for speeding up iterations in large games. The ability to...
Regret-Based Pruning in Extensive-Form Games
Counterfactual Regret Minimization (CFR) is a leading algorithm for finding a Nash equilibrium in large zero-sum imperfect-information games. CFR is an iterative algorithm that repeatedly traverses the game tree, updating regrets at each information set. We introduce an improvement to CFR that prunes any path of play in the tree, and its descendants, that has negative regret. It revisits that s...
Monte Carlo Sampling for Regret Minimization in Extensive Games
Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome samplin...
A Pruning Algorithm for Imperfect Information Games
IMP-minimax is the analog to minimax for games with imperfect information, like card games such as bridge or poker. It computes an optimal strategy for the game if the game has a single player and a certain natural property called perfect recall. IMP-minimax is described fully in a companion paper in this proceedings. Here we introduce an algorithm IMP-alpha-beta that is to IMP-minimax as alpha...
Journal title:
- CoRR
Volume abs/1609.03234
Publication date 2016